Results 1 - 20 of 65,745
1.
An. psicol ; 40(2): 344-354, May-Sep, 2024. ilus, tab, graf
Article in Spanish | IBECS | ID: ibc-232727

ABSTRACT

Several types of intervals are usually reported in meta-analyses, a fact that has generated some confusion when interpreting them. Confidence intervals reflect the uncertainty related to a single number, the parametric mean effect size. Prediction intervals reflect the probable parametric effect size in any study of the same class as those included in a meta-analysis. Their interpretations and applications are different. In this article we explain in detail their different nature and how they can be used to answer specific questions. Numerical examples are included, as well as their computation with the metafor R package.
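For illustration only (not taken from the article, which demonstrates the computation with the metafor R package): a minimal Python sketch, with hypothetical effect sizes, of how the 95% confidence interval for the mean effect differs from the 95% prediction interval for a new study of the same class, assuming the standard DerSimonian-Laird estimator of between-study variance and the usual t-based prediction interval.

```python
import numpy as np
from scipy import stats

# Hypothetical study-level effect sizes (yi) and their sampling variances (vi)
yi = np.array([0.30, 0.45, 0.10, 0.60, 0.25, 0.52, 0.18])
vi = np.array([0.04, 0.09, 0.02, 0.12, 0.05, 0.08, 0.03])
k = len(yi)

# DerSimonian-Laird estimate of the between-study variance tau^2
w_fixed = 1.0 / vi
y_fixed = np.sum(w_fixed * yi) / np.sum(w_fixed)
Q = np.sum(w_fixed * (yi - y_fixed) ** 2)
C = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - (k - 1)) / C)

# Random-effects pooled mean and its standard error
w = 1.0 / (vi + tau2)
mu = np.sum(w * yi) / np.sum(w)
se_mu = np.sqrt(1.0 / np.sum(w))

# 95% confidence interval: uncertainty about the mean effect
z = stats.norm.ppf(0.975)
ci = (mu - z * se_mu, mu + z * se_mu)

# 95% prediction interval: plausible true effect in a new, similar study
t = stats.t.ppf(0.975, df=k - 2)
half = t * np.sqrt(tau2 + se_mu ** 2)
pi = (mu - half, mu + half)

print(f"mean = {mu:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f}), PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is always at least as wide as the confidence interval because it adds the between-study variance to the uncertainty about the mean.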


Subject(s)
Humans , Male , Female , Confidence Intervals , Forecasting , Data Interpretation, Statistical
2.
Biom J ; 66(4): e2300084, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38775273

ABSTRACT

The cumulative incidence function is the standard method for estimating the marginal probability of a given event in the presence of competing risks. One basic but important goal in the analysis of competing risk data is the comparison of these curves, for which limited literature exists. We propose a new procedure that not only tests the equality of these curves but also groups them when they are not equal. The proposed method determines the composition of the groups and automatically selects their number. Simulation studies show the good numerical behavior of the proposed methods for finite sample sizes. The applicability of the proposed method is illustrated using real data.
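For illustration only (not the article's comparison or grouping procedure): a short Python sketch of the standard nonparametric cumulative incidence estimator under competing risks, using made-up observation times and event codes.

```python
import numpy as np

def cumulative_incidence(time, event, cause):
    """Nonparametric cumulative incidence for one cause under competing risks.

    time  : observed follow-up times
    event : 0 = censored, 1, 2, ... = cause of the observed event
    cause : the cause whose cumulative incidence is estimated
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    times = np.unique(time[event != 0])            # distinct event times, any cause
    surv, cif, out = 1.0, 0.0, []
    for t in times:
        n_at_risk = np.sum(time >= t)
        d_any = np.sum((time == t) & (event != 0))
        d_cause = np.sum((time == t) & (event == cause))
        cif += surv * d_cause / n_at_risk          # increment from this cause
        surv *= 1.0 - d_any / n_at_risk            # update all-cause survival
        out.append((t, cif))
    return out

# Hypothetical data: event 1 = event of interest, event 2 = competing event
time = [2, 3, 3, 5, 7, 8, 10, 12, 14, 15]
event = [1, 2, 1, 0, 1, 2, 0, 1, 0, 2]
for t, c in cumulative_incidence(time, event, cause=1):
    print(f"t = {t:>4.1f}   CIF_1(t) = {c:.3f}")
```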


Subject(s)
Models, Statistical , Humans , Incidence , Biometry/methods , Risk Assessment , Computer Simulation , Data Interpretation, Statistical
3.
Stat Med ; 43(11): 2062-2082, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38757695

ABSTRACT

This paper discusses regression analysis of interval-censored failure time data arising from semiparametric transformation models in the presence of missing covariates. Although some methods have been developed for the problem, they either apply only to limited situations or may have some computational issues. To address these limitations, we propose a new and unified two-step inference procedure that can be easily implemented with existing or standard software. The proposed method makes use of a set of working models to extract partial information from incomplete observations and yields a consistent estimator of the regression parameters under the missing-at-random assumption. An extensive simulation study indicates that the method performs well in practical situations. Finally, we apply the proposed approach to the Alzheimer's disease study that motivated this work.


Subject(s)
Alzheimer Disease , Computer Simulation , Models, Statistical , Humans , Regression Analysis , Data Interpretation, Statistical
4.
Elife ; 12, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38739437

ABSTRACT

In several large-scale replication projects, statistically non-significant results in both the original and the replication study have been interpreted as a 'replication success.' Here, we discuss the logical problems with this approach: non-significance in both studies does not ensure that the studies provide evidence for the absence of an effect, and 'replication success' can virtually always be achieved if the sample sizes are small enough. In addition, the relevant error rates are not controlled. We show how methods such as equivalence testing and Bayes factors can be used to adequately quantify the evidence for the absence of an effect and how they can be applied in the replication setting. Using data from the Reproducibility Project: Cancer Biology, the Experimental Philosophy Replicability Project, and the Reproducibility Project: Psychology, we illustrate that many original and replication studies with 'null results' are in fact inconclusive. We conclude that it is important to also replicate studies with statistically non-significant results, but that they should be designed, analyzed, and interpreted appropriately.
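As a hedged illustration of one of the approaches discussed (equivalence testing, not the authors' analysis): the Python sketch below applies the two one-sided tests (TOST) procedure to hypothetical replication summary statistics; the estimate, standard error, and equivalence margin are invented for the example.

```python
from scipy import stats

# Hypothetical replication summary: standardized mean difference, its SE, and df
d_hat, se, df = 0.08, 0.12, 98
margin = 0.30              # smallest effect size of interest (equivalence bound)

# Two one-sided tests: conclude equivalence only if both one-sided tests reject
t_lower = (d_hat + margin) / se            # H0: effect <= -margin
t_upper = (d_hat - margin) / se            # H0: effect >= +margin
p_lower = 1 - stats.t.cdf(t_lower, df)
p_upper = stats.t.cdf(t_upper, df)
p_tost = max(p_lower, p_upper)

verdict = "statistically equivalent within the margin" if p_tost < 0.05 else "inconclusive"
print(f"TOST p = {p_tost:.4f} -> {verdict}")
```

With these numbers the TOST p-value is about 0.03, so the non-significant replication estimate would also count as evidence for the absence of an effect larger than the chosen margin; a non-significant result with a wider standard error would instead be inconclusive.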


Subject(s)
Bayes Theorem , Reproducibility of Results , Humans , Research Design , Sample Size , Data Interpretation, Statistical
5.
Biometrics ; 80(2), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38742907

ABSTRACT

We propose a new non-parametric conditional independence test for a scalar response and a functional covariate over a continuum of quantile levels. We build a Cramer-von Mises type test statistic based on an empirical process indexed by random projections of the functional covariate, effectively avoiding the "curse of dimensionality" under the projected hypothesis, which is almost surely equivalent to the null hypothesis. The asymptotic null distribution of the proposed test statistic is obtained under some mild assumptions. The asymptotic global and local power properties of our test statistic are then investigated. We specifically demonstrate that the statistic is able to detect a broad class of local alternatives converging to the null at the parametric rate. Additionally, we recommend a simple multiplier bootstrap approach for estimating the critical values. The finite-sample performance of our statistic is examined through several Monte Carlo simulation experiments. Finally, an analysis of an EEG data set is used to show the utility and versatility of our proposed test statistic.


Subject(s)
Computer Simulation , Models, Statistical , Monte Carlo Method , Humans , Electroencephalography/statistics & numerical data , Data Interpretation, Statistical , Biometry/methods , Statistics, Nonparametric
6.
Trials ; 25(1): 317, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38741218

ABSTRACT

BACKGROUND: Surgical left atrial appendage (LAA) closure concomitant to open-heart surgery prevents thromboembolism in high-risk patients. Nevertheless, high-level evidence does not exist for LAA closure in patients with any CHA2DS2-VASc score and any preoperative atrial fibrillation or flutter (AF) status; the current trial attempts to provide such evidence. METHODS: The study is designed as a randomized, open-label, outcome assessor-blinded, multicenter trial of adult patients undergoing first-time elective open-heart surgery. Patients with and without AF and with any CHA2DS2-VASc score will be enrolled. The primary exclusion criteria are planned LAA closure, planned AF ablation, or ongoing endocarditis. Before randomization, a three-step stratification process will sort patients by site, surgery type, and preoperative or expected oral anticoagulation treatment. Patients will undergo balanced randomization (1:1) to LAA closure on top of the planned cardiac surgery or to standard care. Block sizes vary from 8 to 16. Neurologists blinded to randomization will adjudicate the primary outcome of stroke, including transient ischemic attack (TIA). The secondary outcomes include a composite of stroke (including TIA) and silent cerebral infarcts, ischemic stroke (including TIA), and a composite of stroke and all-cause mortality. LAA closure is expected to provide a 60% relative risk reduction. In total, 1500 patients will be randomized and followed for 2 years. DISCUSSION: The trial is expected to help inform future guidelines on surgical LAA closure. This statistical analysis plan ensures transparency of analyses and limits potential reporting biases. TRIAL REGISTRATION: Clinicaltrials.gov, NCT03724318. Registered 26 October 2018, https://clinicaltrials.gov/study/NCT03724318 . PROTOCOL VERSION: https://doi.org/10.1016/j.ahj.2023.06.003 .


Subject(s)
Atrial Appendage , Atrial Fibrillation , Cardiac Surgical Procedures , Multicenter Studies as Topic , Randomized Controlled Trials as Topic , Stroke , Humans , Atrial Appendage/surgery , Atrial Fibrillation/surgery , Atrial Fibrillation/complications , Stroke/prevention & control , Stroke/etiology , Cardiac Surgical Procedures/adverse effects , Risk Factors , Treatment Outcome , Risk Assessment , Data Interpretation, Statistical , Ischemic Attack, Transient/prevention & control , Ischemic Attack, Transient/etiology , Male , Female , Left Atrial Appendage Closure
7.
Trials ; 25(1): 312, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38725072

ABSTRACT

BACKGROUND: Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring. METHODS: Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. The p-value is calculated for each upstrapped trial, and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility of upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility. RESULTS: The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative across a variety of simulation settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O'Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2-22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0-15% lower. CONCLUSIONS: In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness and that performance similarities can be identified relative to the considered alpha-spending and conditional power futility monitoring methods.
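A rough Python sketch (not the authors' implementation) of the upstrap idea described above: resample the interim data with replacement up to the planned sample size, compute the final-analysis p-value for each upstrapped trial, and compare the proportion of significant results with a pre-specified decision threshold. The data, test, and thresholds are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def upstrap_futility(ctrl, trt, n_full_per_arm, n_rep=2000,
                     alpha=0.05, success_threshold=0.10):
    """Upstrap-style futility check (sketch): flag futility when the proportion
    of 'significant' upstrapped trials falls below the decision threshold."""
    hits = 0
    for _ in range(n_rep):
        c = rng.choice(ctrl, size=n_full_per_arm, replace=True)
        t = rng.choice(trt, size=n_full_per_arm, replace=True)
        hits += stats.ttest_ind(t, c).pvalue < alpha
    prop_success = hits / n_rep
    return prop_success, prop_success < success_threshold

# Hypothetical interim data: 60 of 200 planned patients per arm observed
ctrl = rng.normal(0.0, 1.0, 60)
trt = rng.normal(0.1, 1.0, 60)
prop, stop = upstrap_futility(ctrl, trt, n_full_per_arm=200)
print(f"proportion of upstrapped trials with p < 0.05: {prop:.2f}; stop for futility: {stop}")
```

The success threshold and the p-value criterion play the role of the calibration parameters the abstract refers to; tightening either makes the rule stop for futility more aggressively.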


Subject(s)
Clinical Trials as Topic , Computer Simulation , Medical Futility , Research Design , Humans , Clinical Trials as Topic/methods , Sample Size , Data Interpretation, Statistical , Models, Statistical , Treatment Outcome
8.
West J Nurs Res ; 46(6): 403, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38733127
9.
Trials ; 25(1): 296, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698442

ABSTRACT

BACKGROUND: The optimal amount and timing of protein intake in critically ill patients are unknown. The REPLENISH (Replacing Protein via Enteral Nutrition in a Stepwise Approach in Critically Ill Patients) trial evaluates whether supplemental enteral protein added to standard enteral nutrition to achieve a high protein intake, given from ICU day 5 until ICU discharge or ICU day 90, reduces all-cause 90-day mortality in adult critically ill, mechanically ventilated patients, compared with no supplemental enteral protein (a moderate protein intake). METHODS: In this multicenter randomized trial, critically ill patients will be randomized to receive supplemental enteral protein (1.2 g/kg/day) added to standard enteral nutrition to achieve a high amount of enteral protein (2-2.4 g/kg/day) or no supplemental enteral protein to achieve a moderate amount of enteral protein (0.8-1.2 g/kg/day). The primary outcome is 90-day all-cause mortality; other outcomes include functional and health-related quality-of-life assessments at 90 days. The study sample size of 2502 patients will have 80% power to detect a 5% absolute risk reduction in 90-day mortality from 30% to 25%. Consistent with international guidelines, this statistical analysis plan specifies the methods for evaluating the primary and secondary outcomes and subgroups. Applying this statistical analysis plan to the REPLENISH trial will facilitate unbiased analyses of clinical data. CONCLUSION: Ethics approval was obtained from the institutional review board, Ministry of National Guard Health Affairs, Riyadh, Saudi Arabia (RC19/414/R). Approvals were also obtained from the institutional review boards of each participating institution. Our findings will be disseminated in an international peer-reviewed journal and presented at relevant conferences and meetings. TRIAL REGISTRATION: ClinicalTrials.gov, NCT04475666 . Registered on July 17, 2020.
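For illustration only (not part of the statistical analysis plan): the stated sample size can be reproduced with the standard normal-approximation formula for comparing two proportions, as in this short Python sketch.

```python
import math
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-proportion comparison."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

n = n_per_group(0.30, 0.25)   # 90-day mortality: 30% control vs 25% intervention
print(n, 2 * n)               # -> 1251 per arm, 2502 in total
```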


Subject(s)
Critical Illness , Dietary Proteins , Enteral Nutrition , Multicenter Studies as Topic , Randomized Controlled Trials as Topic , Humans , Enteral Nutrition/methods , Dietary Proteins/administration & dosage , Data Interpretation, Statistical , Intensive Care Units , Quality of Life , Treatment Outcome , Respiration, Artificial , Time Factors
10.
BMC Med Res Methodol ; 24(1): 110, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714936

ABSTRACT

Bayesian statistics plays a pivotal role in advancing medical science by enabling healthcare companies, regulators, and stakeholders to assess the safety and efficacy of new treatments, interventions, and medical procedures. The Bayesian framework offers a unique advantage over the classical framework, especially when incorporating prior information into a new trial from quality external data, such as historical data or another source of co-data. In recent years, there has been a significant increase in regulatory submissions using Bayesian statistics due to its flexibility and ability to provide valuable insights for decision-making, addressing the modern complexity of clinical trials where purely frequentist designs are inadequate. For regulatory submissions, companies often need to consider the frequentist operating characteristics of the Bayesian analysis strategy, regardless of the design complexity. In particular, the focus is on the frequentist type I error rate and power for all realistic alternatives. This tutorial review aims to provide a comprehensive overview of the use of Bayesian statistics in sample size determination, control of the type I error rate, multiplicity adjustments, external data borrowing, and related topics in the regulatory environment of clinical trials. Fundamental concepts of Bayesian sample size determination and illustrative examples are provided to serve as a valuable resource for researchers, clinicians, and statisticians seeking to develop more complex and innovative designs.
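A minimal Python sketch, not from the review, of one point it raises: evaluating the frequentist type I error rate of a Bayesian decision rule by simulation, here for a hypothetical single-arm beta-binomial design with a posterior-probability success criterion.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def type_one_error(n, p0, prior=(1, 1), cutoff=0.975, n_sim=20000):
    """Simulated frequentist type I error of a Bayesian single-arm design.

    Decision rule: declare success if the posterior probability that the
    response rate exceeds p0 is above `cutoff` (Beta-Binomial model),
    with data generated under the null response rate p0."""
    a, b = prior
    x = rng.binomial(n, p0, size=n_sim)
    post_prob = 1 - stats.beta.cdf(p0, a + x, b + n - x)   # P(p > p0 | x)
    return np.mean(post_prob > cutoff)

for n in (30, 60, 100):
    print(f"n = {n:>3}: simulated type I error = {type_one_error(n, p0=0.20):.3f}")
```

Because the binomial sample space is discrete, the realized type I error varies with the sample size and prior; checking it by simulation is exactly the kind of operating-characteristics exercise the review describes for regulatory submissions.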


Subject(s)
Bayes Theorem , Clinical Trials as Topic , Humans , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Research Design/standards , Sample Size , Data Interpretation, Statistical , Models, Statistical
11.
Cogn Res Princ Implic ; 9(1): 27, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38700660

ABSTRACT

The .05 boundary within Null Hypothesis Statistical Testing (NHST) "has made a lot of people very angry and been widely regarded as a bad move" (to quote Douglas Adams). Here, we move past meta-scientific arguments and ask an empirical question: What is the psychological standing of the .05 boundary for statistical significance? We find that graduate students in the psychological sciences show a boundary effect when relating p-values across .05. We propose this psychological boundary is learned through statistical training in NHST and reading a scientific literature replete with "statistical significance". Consistent with this proposal, undergraduates do not show the same sensitivity to the .05 boundary. Additionally, the size of a graduate student's boundary effect is not associated with their explicit endorsement of questionable research practices. These findings suggest that training creates distortions in initial processing of p-values, but these might be dampened through scientific processes operating over longer timescales.


Subject(s)
Statistics as Topic , Humans , Adult , Young Adult , Data Interpretation, Statistical , Male , Psychology , Female
12.
Am J Physiol Heart Circ Physiol ; 326(6): H1420-H1423, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38700473

ABSTRACT

The use of both sexes or genders should be considered in experimental design, analysis, and reporting. Since there is no requirement to double the sample size or to have sufficient power to study sex differences, challenges for the statistical analysis can arise. In this article, we focus on the topics of statistical power and ways to increase this power. We also discuss the choice of an appropriate design and statistical method and include a separate section on equivalence tests needed to show the absence of a relevant difference.
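As a hedged illustration of the power issue raised here (not an example from the article): under equal variances and a normal approximation, a sex-by-treatment interaction of the same magnitude as a main effect requires roughly four times the total sample size for the same power, as the Python sketch below shows with invented numbers.

```python
from math import sqrt
from scipy.stats import norm

def power_two_sample(delta, sd, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sample comparison of means."""
    se = sd * sqrt(2 / n_per_group)
    z = norm.ppf(1 - alpha / 2)
    return 1 - norm.cdf(z - delta / se) + norm.cdf(-z - delta / se)

def power_interaction(delta, sd, n_per_cell, alpha=0.05):
    """Power for a sex-by-treatment interaction of size delta (difference of the
    two sex-specific treatment effects) in a 2x2 design with n per cell."""
    se = sd * sqrt(4 / n_per_cell)   # SE of a difference of two differences
    z = norm.ppf(1 - alpha / 2)
    return 1 - norm.cdf(z - delta / se) + norm.cdf(-z - delta / se)

print(f"main effect, d=0.5, 64/group  (N=128): power = {power_two_sample(0.5, 1.0, 64):.2f}")
print(f"interaction, d=0.5, 64/cell   (N=256): power = {power_interaction(0.5, 1.0, 64):.2f}")
print(f"interaction, d=0.5, 128/cell  (N=512): power = {power_interaction(0.5, 1.0, 128):.2f}")
```

Power for the interaction only matches the main-effect power once the total sample is quadrupled, which is why requiring both sexes without adjusting the sample size does not by itself provide adequate power to test sex differences.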


Subject(s)
Research Design , Humans , Data Interpretation, Statistical , Sample Size , Female , Male , Animals , Sex Factors , Models, Statistical
14.
Elife ; 12, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38722146

ABSTRACT

Imputing data is a critical issue for machine learning practitioners, including in the life sciences domain, where missing clinical data is a typical situation and the reliability of the imputation is of great importance. Currently, there is no canonical approach for imputation of clinical data and widely used algorithms introduce variance in the downstream classification. Here we propose novel imputation methods based on determinantal point processes (DPP) that enhance popular techniques such as the multivariate imputation by chained equations and MissForest. Their advantages are twofold: improving the quality of the imputed data demonstrated by increased accuracy of the downstream classification and providing deterministic and reliable imputations that remove the variance from the classification results. We experimentally demonstrate the advantages of our methods by performing extensive imputations on synthetic and real clinical data. We also perform quantum hardware experiments by applying the quantum circuits for DPP sampling since such quantum algorithms provide a computational advantage with respect to classical ones. We demonstrate competitive results with up to 10 qubits for small-scale imputation tasks on a state-of-the-art IBM quantum processor. Our classical and quantum methods improve the effectiveness and robustness of clinical data prediction modeling by providing better and more reliable data imputations. These improvements can add significant value in settings demanding high precision, such as in pharmaceutical drug trials where our approach can provide higher confidence in the predictions made.
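For context only (this is not the article's DPP-based method): a short Python sketch of the baseline problem described above, showing how a stochastic chained-equations imputer (scikit-learn's IterativeImputer) introduces seed-dependent variance into downstream classification accuracy; the missingness is generated artificially on a public dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

# Knock out 20% of the entries at random to mimic missing clinical data
mask = rng.random(X.shape) < 0.20
X_miss = X.copy()
X_miss[mask] = np.nan

# Stochastic chained-equations imputation: downstream accuracy varies with the
# seed, which is the variance the DPP-based methods in the article aim to remove
scores = []
for seed in range(5):
    imputer = IterativeImputer(sample_posterior=True, random_state=seed, max_iter=10)
    X_imp = imputer.fit_transform(X_miss)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores.append(cross_val_score(clf, X_imp, y, cv=5).mean())

print("accuracy per imputation seed:", np.round(scores, 4))
print("spread:", round(max(scores) - min(scores), 4))
```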


Subject(s)
Algorithms , Machine Learning , Humans , Data Interpretation, Statistical , Reproducibility of Results
15.
PLoS One ; 19(5): e0303262, 2024.
Article in English | MEDLINE | ID: mdl-38753677

ABSTRACT

In recent years, concern has grown about the inappropriate application and interpretation of P values, especially the use of P<0.05 to denote "statistical significance" and the practice of P-hacking to produce results below this threshold and selectively report them in publications. Such behavior is said to be a major contributor to the large number of false and non-reproducible discoveries found in academic journals. In response, it has been proposed that the threshold for statistical significance be changed from 0.05 to 0.005. The aim of the current study was to use an evolutionary agent-based model composed of researchers who test hypotheses and strive to increase their publication rates in order to explore the impact of a 0.005 P value threshold on P-hacking and published false positive rates. Three scenarios were examined: one in which researchers tested a single hypothesis, one in which they tested multiple hypotheses using a P<0.05 threshold, and one in which they tested multiple hypotheses using a P<0.005 threshold. Effect sizes were varied across models, and output was assessed in terms of researcher effort, number of hypotheses tested, number of publications, and the published false positive rate. The results supported the view that a more stringent P value threshold can reduce the rate of published false positive results. Researchers still engaged in P-hacking under the new threshold, but the effort they expended increased substantially and their overall productivity was reduced, resulting in a decline in the published false positive rate. Compared to other proposed interventions to improve the academic publishing system, changing the P value threshold has the advantage of being relatively easy to implement and could be monitored and enforced with minimal effort by journal editors and peer reviewers.
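A much simpler stand-in for the article's evolutionary agent-based model (illustrative only, with invented settings): the Python sketch below has each simulated researcher test truly null hypotheses until one crosses the significance threshold, showing how moving from 0.05 to 0.005 lowers the chance that P-hacking over a fixed hypothesis budget yields a spurious significant result while increasing the effort expended.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def p_hack(alpha, max_hypotheses, n=30, n_researchers=5000):
    """Crude stand-in for P-hacking: each simulated researcher keeps testing new,
    truly null hypotheses until one yields p < alpha or the budget runs out.
    Returns the fraction of researchers who obtain a spurious 'significant'
    result and the mean number of hypotheses tested."""
    successes, tests = 0, 0
    for _ in range(n_researchers):
        for _h in range(max_hypotheses):
            a = rng.normal(0, 1, n)
            b = rng.normal(0, 1, n)   # no true effect anywhere
            tests += 1
            if stats.ttest_ind(a, b).pvalue < alpha:
                successes += 1
                break
    return successes / n_researchers, tests / n_researchers

for alpha in (0.05, 0.005):
    rate, effort = p_hack(alpha, max_hypotheses=10)
    print(f"alpha = {alpha}: spurious-result rate = {rate:.2f}, "
          f"mean hypotheses tested = {effort:.1f}")
```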


Subject(s)
Models, Statistical , False Positive Reactions , Humans , Data Interpretation, Statistical
16.
Biometrics ; 80(2), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38768225

ABSTRACT

Conventional supervised learning usually operates under the premise that data are collected from the same underlying population. However, challenges may arise when integrating new data from different populations, resulting in a phenomenon known as dataset shift. This paper focuses on prior probability shift, where the distribution of the outcome varies across datasets but the conditional distribution of features given the outcome remains the same. To tackle the challenges posed by such shift, we propose an estimation algorithm that can efficiently combine information from multiple sources. Unlike existing methods that are restricted to discrete outcomes, the proposed approach accommodates both discrete and continuous outcomes. It also handles high-dimensional covariate vectors through variable selection using an adaptive least absolute shrinkage and selection operator penalty, producing efficient estimates that possess the oracle property. Moreover, a novel semiparametric likelihood ratio test is proposed to check the validity of prior probability shift assumptions by embedding the null conditional density function into Neyman's smooth alternatives (Neyman, 1937) and testing study-specific parameters. We demonstrate the effectiveness of our proposed method through extensive simulations and a real data example. The proposed methods serve as a useful addition to the repertoire of tools for dealing with dataset shifts.


Subject(s)
Algorithms , Computer Simulation , Models, Statistical , Probability , Humans , Likelihood Functions , Biometry/methods , Data Interpretation, Statistical , Supervised Machine Learning
17.
Elife ; 13, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38752987

ABSTRACT

We discuss 12 misperceptions, misstatements, or mistakes concerning the use of covariates in observational or nonrandomized research. Additionally, we offer advice to help investigators, editors, reviewers, and readers make more informed decisions about conducting and interpreting research where the influence of covariates may be at issue. We primarily address misperceptions in the context of statistical management of the covariates through various forms of modeling, although we also emphasize design and model or variable selection. Other approaches to addressing the effects of covariates, including matching, have logical extensions from what we discuss here but are not dwelled upon heavily. The misperceptions, misstatements, or mistakes we discuss include accurate representation of covariates, effects of measurement error, overreliance on covariate categorization, underestimation of power loss when controlling for covariates, misinterpretation of significance in statistical models, and misconceptions about confounding variables, selecting on a collider, and p value interpretations in covariate-inclusive analyses. This condensed overview serves to correct common errors and improve research quality in general and in nutrition research specifically.


Subject(s)
Observational Studies as Topic , Research Design , Humans , Research Design/standards , Models, Statistical , Data Interpretation, Statistical
18.
Trials ; 25(1): 286, 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38678289

ABSTRACT

BACKGROUND: The fragility index is a statistical measure of the robustness or "stability" of a statistically significant result. It has been adapted to assess the robustness of statistically significant outcomes from randomized controlled trials. By hypothetically switching some non-responders to responders, for instance, this metric measures how many individuals would need to have responded for a statistically significant finding to become non-statistically significant. The purpose of this study is to assess the fragility index of randomized controlled trials evaluating opioid substitution and antagonist therapies for opioid use disorder. This will provide an indication of the robustness of trials in the field and the confidence that should be placed in the trials' outcomes, potentially identifying ways to improve clinical research in the field. This is especially important as opioid use disorder has become a global epidemic, and the incidence of opioid-related fatalities has climbed 500% in the past two decades. METHODS: Six databases were searched from inception to September 25, 2021, for randomized controlled trials evaluating opioid substitution and antagonist therapies for opioid use disorder and meeting the necessary requirements for fragility index calculation. Specifically, we included all parallel arm or two-by-two factorial design RCTs that assessed the effectiveness of any opioid substitution and antagonist therapies using a binary primary outcome and reported a statistically significant result. The fragility index of each study was calculated using methods described by Walsh and colleagues. The risk of bias of included studies was assessed using the Revised Cochrane Risk of Bias tool for randomized trials. RESULTS: Ten studies with a median sample size of 82.5 (interquartile range (IQR) 58, 179, range 52-226) were eligible for inclusion. Overall risk of bias was deemed to be low in seven studies, to have some concerns in two studies, and to be high in one study. The median fragility index was 7.5 (IQR 4, 12, range 1-26). CONCLUSIONS: Our results suggest that approximately eight participants are needed to overturn the conclusions of the majority of trials in opioid use disorder. Future work should focus on maximizing transparency in reporting of study results by reporting confidence intervals and fragility indexes and emphasizing the clinical relevance of findings. TRIAL REGISTRATION: PROSPERO CRD42013006507. Registered on November 25, 2013.
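A minimal Python sketch, in the spirit of the Walsh et al. fragility index the authors apply (not their exact procedure or code), for a hypothetical 2x2 trial result: non-responders in the arm with fewer events are switched to responders one at a time, recomputing Fisher's exact test, until the result is no longer statistically significant.

```python
from scipy.stats import fisher_exact

def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Sketch of a fragility index: count how many patients in the arm with fewer
    events must be switched from non-event to event before p >= alpha."""
    if events_a > events_b:                  # work on the arm with fewer events
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a
    p = fisher_exact([[events_a, n_a - events_a], [events_b, n_b - events_b]])[1]
    if p >= alpha:
        return 0                             # not significant to begin with
    fi = 0
    while p < alpha and events_a < n_a:
        events_a += 1                        # one non-responder becomes a responder
        fi += 1
        p = fisher_exact([[events_a, n_a - events_a], [events_b, n_b - events_b]])[1]
    return fi

# Hypothetical trial: 10/40 responders on therapy A vs 22/42 on therapy B
print(fragility_index(10, 40, 22, 42))
```

A small index means only a handful of changed outcomes would overturn statistical significance, which is the sense in which the abstract's median of 7.5 quantifies the robustness of the included trials.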


Subject(s)
Narcotic Antagonists , Opiate Substitution Treatment , Opioid-Related Disorders , Randomized Controlled Trials as Topic , Humans , Analgesics, Opioid/therapeutic use , Analgesics, Opioid/adverse effects , Data Interpretation, Statistical , Narcotic Antagonists/therapeutic use , Narcotic Antagonists/adverse effects , Opiate Substitution Treatment/methods , Opioid-Related Disorders/drug therapy , Research Design , Treatment Outcome
19.
Stat Med ; 43(12): 2452-2471, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38599784

ABSTRACT

Many longitudinal studies are designed to monitor participants for major events related to the progression of diseases. Data arising from such longitudinal studies are usually subject to interval censoring since the events are only known to occur between two monitoring visits. In this work, we propose a new method to handle interval-censored multistate data within a proportional hazards model framework where the hazard rate of events is modeled by a nonparametric function of time and the covariates affect the hazard rate proportionally. The main idea of this method is to simplify the likelihood functions of a discrete-time multistate model through an approximation and the application of data augmentation techniques, where the assumed presence of censored information facilitates a simpler parameterization. Then the expectation-maximization algorithm is used to estimate the parameters in the model. The performance of the proposed method is evaluated by numerical studies. Finally, the method is employed to analyze a dataset on tracking the advancement of coronary allograft vasculopathy following heart transplantation.


Subject(s)
Algorithms , Heart Transplantation , Proportional Hazards Models , Humans , Likelihood Functions , Heart Transplantation/statistics & numerical data , Longitudinal Studies , Computer Simulation , Models, Statistical , Data Interpretation, Statistical
20.
Adv Rheumatol ; 64(1): 31, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38650049

ABSTRACT

BACKGROUND: To illustrate how (standardised) effect sizes (ES) vary based on calculation method and to provide considerations for improved reporting. METHODS: Data from three trials of tanezumab in subjects with osteoarthritis were analyzed. The ES of tanezumab versus comparator for WOMAC Pain (the outcome) was defined as the least squares difference between means (from a mixed model for repeated measures analysis) divided by a pooled standard deviation (SD) of outcome scores. Three approaches to computing the SD were evaluated: Baseline (the pooled SD of WOMAC Pain values at baseline [pooled across treatments]); Endpoint (the pooled SD of these values at the time primary endpoints were assessed); and Median (the median pooled SD of these values based on the pooled SDs across available timepoints). Bootstrap analyses were used to compute 95% confidence intervals (CI). RESULTS: ES (95% CI) of tanezumab 2.5 mg based on Baseline, Endpoint, and Median SDs in one study were - 0.416 (- 0.796, - 0.060), - 0.195 (- 0.371, - 0.028), and - 0.196 (- 0.373, - 0.028), respectively; negative values indicate pain improvement. This pattern of ES differences (largest with Baseline SD, smallest with Endpoint SD, Median SD similar to Endpoint SD) was consistent across all studies and doses of tanezumab. CONCLUSION: Differences in ES affect interpretation of treatment effect. Therefore, we advocate clearly reporting the individual elements of the ES in addition to its overall calculation. This is particularly important when ES estimates are used to determine sample sizes for clinical trials, as larger ES will lead to smaller sample sizes and potentially underpowered studies. TRIAL REGISTRATION: Clinicaltrials.gov NCT02697773, NCT02709486, and NCT02528188.
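For illustration only (the numbers below are hypothetical, not the trial data): a short Python sketch of how the same between-group difference yields different standardized effect sizes depending on whether the pooled SD comes from baseline scores, endpoint scores, or the median pooled SD across visits.

```python
import numpy as np

def pooled_sd(sd_a, n_a, sd_b, n_b):
    """SD pooled across two treatment arms."""
    return np.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))

# Hypothetical WOMAC Pain summaries: mean difference and per-arm SDs by visit
diff_means = -0.55                   # least squares difference, drug minus comparator
n_t, n_c = 200, 200
sd_baseline = pooled_sd(1.3, n_t, 1.4, n_c)               # SDs of baseline scores
sd_endpoint = pooled_sd(2.7, n_t, 2.9, n_c)               # SDs at the primary endpoint
sd_median = np.median([sd_baseline, 2.1, sd_endpoint])    # median pooled SD over visits

for label, sd in [("Baseline", sd_baseline), ("Endpoint", sd_endpoint), ("Median", sd_median)]:
    print(f"{label:8s} SD = {sd:.2f} -> ES = {diff_means / sd:+.3f}")
```

Because baseline scores typically vary less than post-treatment scores, dividing by the baseline pooled SD inflates the standardized ES for the same raw difference, which is the pattern the abstract reports.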


Subject(s)
Antibodies, Monoclonal, Humanized , Osteoarthritis , Randomized Controlled Trials as Topic , Humans , Antibodies, Monoclonal, Humanized/therapeutic use , Data Interpretation, Statistical , Osteoarthritis/drug therapy , Pain Measurement , Treatment Outcome